Dimension Reduction and Classifier-Based Feature Selection for Oversampled Gene Expression Data and Cancer Classification

نویسندگان

چکیده

Gene expression data are usually known for having a large number of features. Usually, some these features irrelevant and redundant. However, in cases, all features, despite being numerous, show high importance contribute to the analysis. In similar fashion, gene sometimes have limited instances with rate imbalance among classes. This can limit exposure classification model different categories, thereby influencing performance model. this study, we proposed cancer detection approach that utilized preprocessing techniques such as oversampling, feature selection, models. The study used SVMSMOTE oversampling six examined datasets. Further, selection using dimension reduction methods classifier-based ranking selection. We trained machine learning algorithms, repeated 5-fold cross-validation on microarray algorithms differed based technique used.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Classification and Biomarker Genes Selection for Cancer Gene Expression Data Using Random Forest

Background & objective: Microarray and next generation sequencing (NGS) data are the important sources to find helpful molecular patterns. Also, the great number of gene expression data increases the challenge of how to identify the biomarkers associated with cancer. The random forest (RF) is used to effectively analyze the problems of large-p and smal...

متن کامل

Prediction of blood cancer using leukemia gene expression data and sparsity-based gene selection methods

Background: DNA microarray is a useful technology that simultaneously assesses the expression of thousands of genes. It can be utilized for the detection of cancer types and cancer biomarkers. This study aimed to predict blood cancer using leukemia gene expression data and a robust ℓ2,p-norm sparsity-based gene selection method. Materials and Methods: In this descriptive study, the microarray ...

متن کامل

Developing a Filter-Wrapper Feature Selection Method and its Application in Dimension Reduction of Gen Expression

Nowadays, increasing the volume of data and the number of attributes in the dataset has reduced the accuracy of the learning algorithm and the computational complexity. A dimensionality reduction method is a feature selection method, which is done through filtering and wrapping. The wrapper methods are more accurate than filter ones but perform faster and have a less computational burden. With ...

متن کامل

Feature Selection for Cancer Classification Using Microarray Gene Expression Data

The DNA microarray technology enables us to measure the expression levels of thousands of genes simultaneously, providing great chance for cancer diagnosis and prognosis. The number of genes often exceeds tens of thousands, whereas the number of subjects available is often no more than a hundred. Therefore, it is necessary and important to perform gene selection for classification purpose. A go...

متن کامل

SFLA Based Gene Selection Approach for Improving Cancer Classification Accuracy

 In this paper, we propose a new gene selection algorithm based on Shuffled Frog Leaping Algorithm that is called SFLA-FS. The proposed algorithm is used for improving cancer classification accuracy. Most of the biological datasets such as cancer datasets have a large number of genes and few samples. However, most of these genes are not usable in some tasks for example in cancer classification....

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Processes

سال: 2023

ISSN: ['2227-9717']

DOI: https://doi.org/10.3390/pr11071940